Robust classification for skewed data

نویسندگان

  • Mia Hubert
  • Stephan Van der Veeken
چکیده

In this paper we propose a robust classification rule for skewed distributions. For low dimensional data, the classification is based on the adjusted outlyingness. In the case of high dimensional data, the robustified SIMCA method is adjusted for skewness. The robustness of the method is investigated through different simulations and by applying it to various real examples.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Intelligent and Robust Genetic Algorithm Based Classifier

The concepts of robust classification and intelligently controlling the search process of genetic algorithm (GA) are introduced and integrated with a conventional genetic classifier for development of a new version of it, which is called Intelligent and Robust GA-classifier (IRGA-classifier). It can efficiently approximate the decision hyperplanes in the feature space. It is shown experime...

متن کامل

Robustified distance based fuzzy membership function for support vector machine classification

Fuzzification of support vector machine has been utilized to deal with outlier and noise problem. This importance is achieved, by the means of fuzzy membership function, which is generally built based on the distance of the points to the class centroid. The focus of this research is twofold. Firstly, by taking the advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...

متن کامل

Robust PCA for skewed data and its outlier map

The outlier sensitivity of classical principal component analysis (PCA) has spurred the development of robust techniques. Existing robust PCA methods like ROBPCA work best if the non-outlying data have an approximately symmetric distribution. When the original variables are skewed, too many points tend to be flagged as outlying. A robust PCA method is developed which is also suitable for skewed...

متن کامل

Palarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm

Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...

متن کامل

A Quadratic Mean based Supervised Learning Model for Managing Data Skewness

In this paper, we study the problem of data skewness. A data set is skewed/imbalanced if its dependent variable is asymmetrically distributed. Dealing with skewed data sets has been identified as one of the ten most challenging problems in data mining research. We address the problem of class skewness for supervised learning models which are based on optimizing a regularized empirical risk func...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Adv. Data Analysis and Classification

دوره 4  شماره 

صفحات  -

تاریخ انتشار 2010